Goto

Collaborating Authors

 simultaneous gradient ascent



Reviews: The Numerics of GANs

Neural Information Processing Systems

This paper presents a novel analysis of the typical optimization algorithm used in GANs (simultaneous gradient ascent) and identifies problematic failures when the Jacobian has large imaginary components or zero real components. Motivated by these failures, they present a novel consensus optimization algorithm for training GANs. The consensus optimization is validated on a toy MoG dataset as well as CIFAR-10 and CelebA in terms of sample quality and inception score. I found this paper enjoyable to read and the results compelling. My primary concern is the lack of hyperparameter search when comparing optimization algorithms and lack of evidence that the problems identified with simultaneous gradient ascent are truly problems in practice.


4588e674d3f0faf985047d4c3f13ed0d-Paper.pdf

Neural Information Processing Systems

In this paper, we analyze the numerics of common algorithms for training Generative Adversarial Networks (GANs). Using the formalism of smooth two-player games we analyze the associated gradient vector field of GAN training objectives. Our findings suggest that the convergence of current algorithms suffers due to two factors: i) presence of eigenvalues of the Jacobian of the gradient vector field with zero real-part, and ii) eigenvalues with big imaginary part. Using these findings, we design a new algorithm that overcomes some of these limitations and has better convergence properties. Experimentally, we demonstrate its superiority on training common GAN architectures and show convergence on GAN architectures that are known to be notoriously hard to train.


Smooth markets: A basic mechanism for organizing gradient-based learners

Balduzzi, David, Czarnecki, Wojciech M, Anthony, Thomas W, Gemp, Ian M, Hughes, Edward, Leibo, Joel Z, Piliouras, Georgios, Graepel, Thore

arXiv.org Artificial Intelligence

With the success of modern machine learning, it is becoming increasingly important to understand and control how learning algorithms interact. Unfortunately, negative results from game theory show there is little hope of understanding or controlling general n-player games. We therefore introduce smooth markets (SM-games), a class of n-player games with pairwise zero sum interactions. SM-games codify a common design pattern in machine learning that includes (some) GANs, adversarial training, and other recent algorithms. We show that SM-games are amenable to analysis and optimization using first-order methods.


The Numerics of GANs

Mescheder, Lars, Nowozin, Sebastian, Geiger, Andreas

Neural Information Processing Systems

In this paper, we analyze the numerics of common algorithms for training Generative Adversarial Networks (GANs). Using the formalism of smooth two-player games we analyze the associated gradient vector field of GAN training objectives. Our findings suggest that the convergence of current algorithms suffers due to two factors: i) presence of eigenvalues of the Jacobian of the gradient vector field with zero real-part, and ii) eigenvalues with big imaginary part. Using these findings, we design a new algorithm that overcomes some of these limitations and has better convergence properties. Experimentally, we demonstrate its superiority on training common GAN architectures and show convergence on GAN architectures that are known to be notoriously hard to train.